Background/Aims: Recent studies have demonstrated that magnifying endoscopy with narrow band imaging (ME-NBI) facilitates
differentiation of early gastric cancer from gastric adenoma using vessel plus surface (VS) classification. This study estimated the
interobserver and intraobserver agreement of endoscopists using the Yao VS classification system for the gastric mucosal surface.

Methods: We retrospectively reviewed patients who underwent endoscopic submucosal dissection or endoscopic mucosal resection, 
and selected cases in which preoperative ME-NBI was conducted. Before testing endoscopists, a 20-minute training module was given. 
Static ME-NBI images (n=47 cases) were presented to seven endoscopists (two experts and five trainees) who were asked to assess the
images in 20 seconds using the Yao VS classification system. After 2 weeks, the endoscopists were asked to analyze the images again. The
κ statistic was calculated for intraobserver and interobserver variability.

Results: The mean κ for intraobserver agreement was 0.69 (experts, 0.74; trainees, 0.64). The mean κ for interobserver agreement was
0.42 (experts, 0.49; trainees, 0.40).

Conclusions: We obtained reliable results as assessed by observer variability, with only brief training on VS classification. The VS classi- 
fication appears to provide an objective assessment of ME-NBI for trainees who are not familiar with ME-NBI. 
INTRODUCTION

The ability to distinguish between benign and malignant
lesions by endoscopy is critical. However, endoscopy using
conventional white light imaging (C-WLI) alone is insufficient for
accurate diagnosis. Histological findings after resection
sometimes show that low-grade adenoma has transitioned to
high-grade adenoma or early cancer. Pathologists have difficulty
making an accurate diagnosis using only a small biopsy specimen,
and endoscopists are challenged to obtain a target biopsy from
the most suspicious part of the lesion using C-WLI endoscopy
alone. Differentiating low-grade adenoma, high-grade adenoma,
and early cancer is therefore difficult, even after histological
biopsy-based examination.

Magnifying endoscopy with narrow band imaging (ME-NBI)
is a powerful tool for diagnosing superficial neoplasms
in the gastrointestinal tract. The NBI system is an endoscopic
imaging technique for the enhanced visualization of
microvascular (MV) architecture and microsurface (MS) structures of
the superficial part of the mucosa. Combining ME and an NBI
system allows simple and clear visualization of MS structures
and MV patterns of the superficial mucosa, which could be
useful for obtaining a precise endoscopic diagnosis that matches
the histopathological diagnosis.

Several studies have shown the potential utility of
microscopic capillaries, which can be visualized by ME-NBI, for
predicting gastric neoplasia among superficially depressed,
flat, or elevated early gastric neoplastic lesions.

Endoscopic diagnosis is subjective and often shows relative- 
ly low interobserver concordance. In a review of the ME-NBI 
literature, we found few studies on observer agreement, with 
most studies involving expert endoscopists. This study aimed 
to estimate interobserver and intraobserver variability based 
on the vessel plus surface (VS) classification of Yao et al.

MATERIALS AND METHODS 

Patients and ME-NBI images 

We retrospectively evaluated our endoscopic submucosal 
dissection (ESD) or endoscopic mucosal resection (EMR) da- 
tabases for patients treated at the Kosin University Gospel 
Hospital, Busan, Korea from January 2011 through June 2012.
Endoscopic examinations, ESD (n=99), and EMR (n=199) 
were performed by two expert endoscopists. Pretreatment 
ME-NBI endoscopic examinations were performed before 32 
ESDs and 18 EMRs. A GIF-H260Z endoscope (CLV 260 SL; 
Olympus, Tokyo, Japan) was used for preoperative endoscopic 
examinations. Pathological diagnostic criteria were based on
the revised Vienna classification: category 4 (noninvasive
high-grade neoplasia) or category 5 (invasive neoplasia) lesions
were regarded as carcinoma, whereas category 3 (noninvasive
low-grade neoplasia) lesions were regarded as adenoma.

We analyzed both depressed and nondepressed lesions. We 
excluded one case in which the image was unfocused, making 
discrimination difficult. The images in two cases were used as 
training images (Figs. 1, 2). To minimize selection bias, ME- 
NBI images were selected by an endoscopist who did not per- 
form the endoscopic examinations and who was blinded to 
the clinical information. In total, 47 cases were selected. Two 
static ME-NBI images per case were selected. We selected only 
ME-NBI images without white light endoscopy imaging to
avoid affecting the judgment of the endoscopists. 

Education and review of investigators 

Seven endoscopists were divided into two groups: two expe- 
rienced endoscopists who had used ME-NBI occasionally for 
>3 years, and five trainees who had been performing endosco- 
pies for <1 year and had never used ME-NBI. We first ex- 
plained the VS classification system to the endoscopists using 
the image from the original publication by Yao et al. and static
images from two of our cases. 

Fig. 2. Magnifying endoscopy with narrow band imaging showing
a typical finding of regular microvascular and microsurface
patterns.

ME-NBI diagnostic criteria were based on the VS classification
system proposed by Yao et al., which describes lesions as
having 1) an irregular MV pattern with a demarcation line
between the lesion and the surrounding area, and 2) an irregular
MS pattern with a demarcation line between the lesion and the
surrounding area. If a target lesion has a finding of 1) or 2), it
is considered positive.
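
The positivity rule described above can be written as a short sketch. The record layout and the names `VSFinding` and `is_vs_positive` are hypothetical and are introduced here only for illustration; they are not part of the study protocol.

```python
from dataclasses import dataclass


@dataclass
class VSFinding:
    """One reader's assessment of one lesion (hypothetical record layout)."""
    irregular_mv_with_dl: bool  # criterion 1: irregular MV pattern with a demarcation line
    irregular_ms_with_dl: bool  # criterion 2: irregular MS pattern with a demarcation line


def is_vs_positive(finding: VSFinding) -> bool:
    """Return True when criterion 1 or criterion 2 is met."""
    return finding.irregular_mv_with_dl or finding.irregular_ms_with_dl


if __name__ == "__main__":
    print(is_vs_positive(VSFinding(True, False)))
    print(is_vs_positive(VSFinding(False, False)))
```

Under this reading, the "MV or MS" rows of the agreement tables correspond to the overall positive/negative call, whereas the "MV and MS" rows would require both criteria to be rated the same way.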

We did not provide any information that might have affect- 
ed the endoscopists' assessments. The seven endoscopists as- 
sessed the VS pattern of 47 cases (two images per case) for 20 
seconds per case. Two weeks later, under the same conditions, 
re-evaluation without retraining was performed to calculate in- 
traobserver agreement. 

Statistical analysis 

We used chi-square and Fisher exact tests for categorical
comparison of the data. A p<0.05 was considered statistically
significant. Intraobserver and interobserver agreement was
evaluated by the κ value, which varies from -1 (complete
disagreement) to +1 (complete agreement), with zero indicating
agreement by chance only. κ values >0.8 indicated almost
perfect agreement, 0.6 to 0.79 indicated substantial agreement, 0.4
to 0.59 indicated moderate agreement, 0.2 to 0.39 indicated
fair agreement, and <0.2 indicated slight or very poor
agreement.
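
As a minimal sketch of how a two-rater κ and the interpretation bands above could be computed, the following Python example may be helpful. The study itself used R; the function names, the example ratings, and the handling of values that fall exactly on a band boundary are assumptions made for illustration.

```python
from collections import Counter
from typing import Sequence


def cohen_kappa(rater_a: Sequence[int], rater_b: Sequence[int]) -> float:
    """Cohen's kappa for two raters who rated the same cases."""
    if len(rater_a) != len(rater_b) or len(rater_a) == 0:
        raise ValueError("both raters must rate the same non-empty set of cases")
    n = len(rater_a)
    # Observed proportion of cases on which the two raters agree.
    p_observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Agreement expected by chance, from each rater's marginal frequencies.
    freq_a, freq_b = Counter(rater_a), Counter(rater_b)
    p_chance = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    if p_chance == 1.0:
        raise ValueError("kappa is undefined when both raters use a single category")
    return (p_observed - p_chance) / (1.0 - p_chance)


def interpret_kappa(kappa: float) -> str:
    """Map kappa to the bands listed in the text (boundary handling is assumed)."""
    if kappa > 0.8:
        return "almost perfect"
    if kappa >= 0.6:
        return "substantial"
    if kappa >= 0.4:
        return "moderate"
    if kappa >= 0.2:
        return "fair"
    return "slight or very poor"


if __name__ == "__main__":
    # Hypothetical ratings (1 = VS positive, 0 = VS negative) for ten cases.
    first = [1, 1, 0, 0, 1, 0, 1, 1, 0, 0]
    second = [1, 1, 0, 1, 1, 0, 0, 1, 0, 0]
    k = cohen_kappa(first, second)
    print(round(k, 3), interpret_kappa(k))
```

In the hypothetical example the raters agree on 8 of 10 cases and chance agreement is 0.5, so κ is (0.8 - 0.5)/(1 - 0.5) = 0.6. The same function applies to intraobserver agreement when the two lists hold one reader's first and second assessments.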

The κ value was calculated for endoscopists in each group
based on the VS pattern and histology prediction. The
diagnostic accuracy of prediction was calculated for each
endoscopist according to the final histology. All analyses were
performed using R version 2.15.1.
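
The accuracy measures reported for each endoscopist follow the standard definitions, which can be sketched as below. The function name and the counts in the example are hypothetical; they are chosen only to match the 12 carcinomas and 35 adenomas in the data set and do not reproduce any reader's actual results.

```python
from typing import Dict


def diagnostic_metrics(tp: int, fp: int, tn: int, fn: int) -> Dict[str, float]:
    """Sensitivity, specificity, accuracy, and PPV from a 2x2 confusion table.

    tp: carcinomas rated VS positive    fn: carcinomas rated VS negative
    tn: adenomas rated VS negative      fp: adenomas rated VS positive
    """
    if min(tp, fp, tn, fn) < 0:
        raise ValueError("counts must be non-negative")
    if tp + fn == 0 or tn + fp == 0 or tp + fp == 0:
        raise ValueError("a required denominator is zero")
    total = tp + fp + tn + fn
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "accuracy": (tp + tn) / total,
        "ppv": tp / (tp + fp),
    }


if __name__ == "__main__":
    # Hypothetical reader: 11 of 12 carcinomas and 22 of 35 adenomas called correctly.
    for name, value in diagnostic_metrics(tp=11, fp=13, tn=22, fn=1).items():
        print(f"{name}: {100 * value:.1f}%")
```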

RESULTS 

All seven participants completed both study stages. Baseline 
characteristics and endoscopic features of the 47 lesions are 
presented in Table 1. Of the 47 lesions, adenoma was the pre- 
operative diagnosis for 35 lesions and carcinoma for 12, based 
on white light endoscopy. Lesion size was significantly larger 
in carcinomas than in adenomas (p=0.001). Ulceration
(p=0.002) and red color (p=0.001) were significant predictive
factors for carcinoma, but a depressed morphology did not have
significant predictive value (Table 1).
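
For readers who wish to see how a Fisher exact p-value of this kind arises, the sketch below computes a two-sided test from a 2x2 table using only the Python standard library. It is an illustration, not the authors' R code; the function name is an assumption, and the two-sided rule used (summing all tables no more probable than the observed one) is one common convention. The example uses the ulcer counts from Table 1 (adenoma 2 with and 33 without; carcinoma 6 with and 6 without).

```python
from math import comb


def fisher_exact_two_sided(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher exact test for the 2x2 table [[a, b], [c, d]]."""
    row1, col1, col2 = a + b, a + c, b + d
    n = a + b + c + d
    lo = max(0, row1 - col2)   # smallest feasible value of the top-left cell
    hi = min(row1, col1)       # largest feasible value of the top-left cell

    def weight(x: int) -> int:
        # Unnormalized hypergeometric probability of a table with top-left cell x.
        return comb(col1, x) * comb(col2, row1 - x)

    observed = weight(a)
    # Integer arithmetic keeps the "no more probable than observed" comparison exact.
    extreme = sum(weight(x) for x in range(lo, hi + 1) if weight(x) <= observed)
    return extreme / comb(n, row1)


if __name__ == "__main__":
    p_ulcer = fisher_exact_two_sided(2, 33, 6, 6)
    print(f"ulcer: p = {p_ulcer:.4f}")
```

With these counts the sketch gives p of about 0.0018, which rounds to the 0.002 reported for ulceration.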

Intraobserver data 

The κ value for individual intraobserver variability varied
from 0.51 to 0.90 (Table 2). The mean κ value for intraobserver
variability based on the Yao VS classification was 0.69 (experts,
0.74; trainees, 0.64) (Table 3). The mean κ values for both the
expert and trainee groups indicated substantial agreement.

Interobserver data 

The mean κ value for interobserver agreement based on VS
classification was 0.45 (experts, 0.59; trainees, 0.43) in the first
review and 0.40 (experts, 0.39; trainees, 0.38) in the second
review. The mean κ value for interobserver agreement was 0.42
(experts, 0.49; trainees, 0.40) (Table 4). The mean κ values for
both the expert and trainee groups indicated moderate
agreement.
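
The footnote to Table 4 indicates that Cohen κ was used for the expert pair and Fleiss κ for groups of more than two readers. A minimal sketch of Fleiss κ is given below, assuming the ratings have already been tallied into a per-case count of readers choosing each category; the function name and the example counts are hypothetical.

```python
from typing import Sequence


def fleiss_kappa(counts: Sequence[Sequence[int]]) -> float:
    """Fleiss' kappa.

    counts[i][j] is the number of raters who assigned case i to category j.
    Every case must be rated by the same number of raters (at least two).
    """
    if not counts:
        raise ValueError("at least one case is required")
    n_cases = len(counts)
    n_raters = sum(counts[0])
    n_categories = len(counts[0])
    if n_raters < 2:
        raise ValueError("at least two raters are required")
    if any(len(row) != n_categories or sum(row) != n_raters for row in counts):
        raise ValueError("every case needs the same categories and rater count")

    # Mean over cases of the proportion of agreeing rater pairs.
    pairs = n_raters * (n_raters - 1)
    p_bar = sum((sum(c * c for c in row) - n_raters) / pairs for row in counts) / n_cases

    # Chance agreement from the overall proportion of ratings in each category.
    total_ratings = n_cases * n_raters
    p_e = sum(
        (sum(row[j] for row in counts) / total_ratings) ** 2
        for j in range(n_categories)
    )
    if p_e == 1.0:
        raise ValueError("kappa is undefined when only one category is used")
    return (p_bar - p_e) / (1.0 - p_e)


if __name__ == "__main__":
    # Hypothetical: 4 cases, 3 raters, columns = (VS negative, VS positive).
    example = [[3, 0], [0, 3], [2, 1], [1, 2]]
    print(round(fleiss_kappa(example), 3))
```

In the hypothetical example the mean pairwise agreement is 2/3 and chance agreement is 0.5, so κ is (2/3 - 1/2)/(1 - 1/2) = 1/3.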

Diagnostic accuracy of the VS classification 

The diagnostic accuracy of the VS classification was calcu- 
lated for each investigator according to the final histology. 
However, diagnostic accuracy was difficult to assess because of
the retrospective analysis of the selected static images. The di- 
agnostic accuracy of VS classification was 72.2% (experts, 
69.2%; trainees, 75.3%) in the first review and 70.3% (experts, 
69.2%; trainees, 71.5%) in the second review (Table 5). The 
mean diagnostic accuracy of the VS classification was 71.3% 
(experts, 69.2%; trainees, 73.4%). 

Table 1. Analysis of Baseline Characteristics and Endoscopic Features in Adenomas and Carcinomas

Characteristic                                   Adenoma (n=35)   Carcinoma (n=12)   p-value
Sex, male/female (% male)                        28/7 (80.0)      9/3 (75.0)         0.700a)
Age, yr                                          60.8±9.0         66.2±8.8           0.077
Diameter of lesion, mm                           13.6±4.1         18.8±5.5           0.001
Lesion type
  Ulcer, with/without (% with)                   2/33 (5.7)       6/6 (50.0)         0.002a)
  Type, depressed/not depressed (% depressed)    2/33 (5.7)       2/10 (16.7)        0.266a)
  Color, red/not red (% red)                     9/26 (25.7)      10/2 (83.3)        0.001a)

a)Fisher exact test.

Table 2. The Intraobserver Agreement for Magnifying Endoscopy
with Narrow Band Imaging Using Yao Vessel Plus Surface
Classification

             E1      E2      T1      T2      T3      T4      T5
MV or MS     0.851   0.625   0.902   0.683   0.592   0.510   0.529
MV           0.628   0.719   0.415   0.619   0.551   0.384   0.465
MS           0.558   0.547   0.705   0.723   0.325   0.629   0.563
MV and MS    0.553   0.642   0.393   0.554   0.357   0.366   0.384

E, expert; T, trainee; MV, microvascular; MS, microsurface.

Table 3. The Intraobserver Agreement for Magnifying Endoscopy
with Narrow Band Imaging Using Yao Vessel Plus Surface
Classification

             E       T       Total
MV or MS     0.738   0.643   0.691
MV           0.674   0.487   0.580
MS           0.553   0.589   0.571
MV and MS    0.598   0.411   0.504

E, expert; T, trainee; MV, microvascular; MS, microsurface.

Table 4. The Interobserver Agreement for Magnifying Endoscopy with Narrow Band Imaging Using Yao Vessel Plus Surface Classification

             1st                            2nd                            Total (mean)
             E1-E2a)   T1-T5b)   Totalb)    E1-E2a)   T1-T5b)   Totalb)    E1-E2a)   T1-T5b)   Totalb)
MV or MS     0.592     0.429     0.448      0.391     0.379     0.399      0.492     0.404     0.424
MV           0.576     0.365     0.378      0.506     0.292     0.347      0.541     0.329     0.363
MS           0.350     0.386     0.375      0.359     0.390     0.391      0.355     0.388     0.383
MV and MS    0.286     0.272     0.260      0.259     0.231     0.249      0.273     0.252     0.255

E, expert; T, trainee; MV, microvascular; MS, microsurface.
a)Cohen κ; b)Fleiss κ.

Table 5. The Diagnostic Accuracy of Magnifying Endoscopy with Narrow Band Imaging Using Yao Vessel Plus Surface Classification

               1st                      2nd                      Total (mean)
               E       T       Total    E       T       Total    E       T       Total
Sensitivity    100.0   86.7    93.3     93.4    82.7    88.0     96.7    84.7    90.7
Specificity    54.7    70.0    62.4     57.8    66.3    62.0     56.3    68.1    62.2
Accuracy       69.2    75.3    72.2     69.2    71.5    70.3     69.2    73.4    71.3
PPV            51.0    59.8    55.4     53.0    57.5    55.3     52.0    58.7    55.3

Values are percentages.
E, expert; T, trainee; PPV, positive predictive value.

DISCUSSION

An accurate diagnosis of gastric lesions is essential to
determine optimal treatment. However, endoscopy with C-WLI
and pretreatment forceps biopsy is insufficient to obtain an
accurate diagnosis. ME-NBI is a powerful tool for characterizing
the gastric mucosal surface because it enables the visualization
of the precise microanatomies of both the MV and MS
patterns of gastric mucosal lesions.

Yao et al. reported that the negative predictive value of the
demarcation line was 100% and that an irregular MV pattern
gave a diagnostic accuracy of 98.7% for small, flat gastric
cancers in blinded prospective studies. Several studies have shown
the usefulness of microscopic capillaries seen by ME-NBI for
predicting gastric neoplasia among superficial depressed, flat,
or elevated early gastric neoplastic lesions.

Endoscopic diagnosis is subjective and therefore often shows
relatively low interobserver concordance. Our search of the
literature revealed few studies of the intraobserver and
interobserver variability of ME-NBI in diagnosing superficial gastric
lesions. In this study, we performed a retrospective clinical
investigation of the interobserver and intraobserver variability of
experts and trainees analyzing ME-NBI images. A κ value >0.4
was considered acceptable.

A previous study of ME-NBI images using clear, static
images is similar to our study. However, the criteria used to
divide endoscopists into experts and trainees differed, and the
proportion of depressed lesions was lower than that in our
study. A shortcoming is that only agreement was assessed,
without a comparison of C-WLI and ME-NBI.

In this study, the mean κ for the intraobserver agreement
was 0.69 (substantial agreement) and the mean κ for
interobserver agreement was 0.42 (moderate agreement). The Los
Angeles classification is widely used for endoscopic
gastroesophageal reflux disease and this classification has very low
interobserver variability. In comparison, the results from
this study indicated a relatively reliable agreement.

In this retrospective study, the mean diagnostic accuracy of 
the VS classification was 71.3%. Considering that we made no 
distinction between depressed or nondepressed lesions and 
had a high proportion of nondepressed lesions, this result is 
acceptable. Using the Yao et al. VS classification, we obtained
reliable results with only brief instruction on image analysis.
Interobserver agreement was not significantly different for
experts and trainees.

This study had several limitations. Table 1 shows that C- 
WLI (diameter, mucosal colors, and ulcer) was highly accurate 
at predicting histology results. Thus, ME-NBI might not be 
necessary. However, this retrospective study evaluated observ- 
er agreement by random selection; therefore, comparing the 
accuracy of C-WLI and ME-NBI is not relevant. A multicenter
study with many experts found that ME-NBI had a higher
accuracy and specificity than C-WLI.

Comparing the mean κ values for interobserver
agreement (Table 4), we found that the mean κ value slightly
decreased in the second test, even without retraining.
Agreement is generally expected to increase with training. The
reduced agreement might be because ME-NBI is uncommon
and endoscopists have little training using this method.
Increased ME-NBI use and education might increase agreement.
Contrary to expectations, the trainees had higher mean
diagnostic accuracy than experts according to Table 5. Additional
analysis may be necessary. Agreement for determination of a
positive or negative diagnosis based on VS classification was
higher than agreement for determinations made based on MV
and MS patterns. This finding suggested that simultaneously
using both MS and MV patterns might increase agreement, 
although the possibility that higher agreement might have
been due to simplified positive and negative categories cannot 
be ruled out. 

Finally, the study included few experienced endoscopists, and
intraobserver and interobserver agreement was not satisfactory
even in the expert group. We attribute this result to the
fact that, even among experts, the frequency of ME-NBI use
varies widely.

This study was a single center retrospective study with sev- 
eral limitations. Future multicenter prospective and well-de- 
signed studies are needed. 

In conclusion, by using the Yao VS classification, we ob- 
tained reliable results from experts and trainees after brief in- 
struction. The classification provided an objective measure of 
ME-NBI for trainees who were not familiar with ME-NBI. 